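xyz2coors command (skips the two header lines of each xyz file and keeps columns 2-4, the coordinates):
for i in *.xyz ; do echo "$i"; fn=${i%.*}; tail -n +3 "$i" | gawk '{print $2 " " $3 " " $4}' > "$fn".coors ; done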
Usage of g03log2xyz (extracts coordinates from Gaussian output):
./g03log2xyz [Molecule Name] [Number of atoms] > [xyz output file]
#!/bin/bash
# Created by Andrey I. Frolov, 8 Jan 2011
# Usage: GrepRefsFromTex.sh TEXFILE > REFSFILE
#
# The script greps all the references from the tex file (the names which are present in \cite{ ... } tags)
f=$1
tmp='TMPFILE_JUSTANYSTUPIDNAME.tmp'
tmp2='TMPFILE_JUSTANYSTUPIDNAME_2.tmp'
cp "$f" "$tmp"
### I.
# Removing all EOL (end of lines) from the file
sed -i ':a;N;$!ba;s/\n/\t/g' $tmp
### II.
# Replace all '}' symbols by the EOL symbols (by "Enter" keyboard key)
sed -i 's/\}/\n/g' "$tmp"
### III.
# Grep citations:
# 1) cat TMP : list the file
# 2) grep -o "cite{.*" : grep all words starting from 'cite{' till the end of line
# 3) sed 's/cite{//g' : remove all 'cite{' entries
# 4) sed 's/\ //g' : remove all spaces
# 5) gawk -F , '{out=""; for(i=1;i<=NF;i++){print $i}}' : split the strings by ',' separator, print all the fields
cat "$tmp" | grep -o "cite{.*" | sed 's/cite{//g' | sed 's/\ //g' | gawk -F , '{out=""; for(i=1;i<=NF;i++){print $i}}' > "$tmp2"
### IV.
# sort -u FN : print out unique entries
sort -u "$tmp2"
rm $tmp $tmp2
#!/bin/bash
# Created by Andrey I. Frolov, 8 Jan 2011
# Usage: GrepBibForRefs BIGBIBFILE REFFILE > NEWBIBFILE
#
# The script greps from the BIGBIBFILE file only the BIB entries for the reference names present in REFSFILE.
bib=$1
ref=$2
for name in `cat $ref` ; do
#echo $name
### Check if there is the Ref in the Bib file. Print the output into standard error.
n=`cat $bib | grep "@.*$name," | wc -l`
if [ "$n" -le "0" ] ; then eval echo "WARNING! No "$name" field in BIB file" >&2 ; continue ; fi
### Grep all the lines between the name "@.*{$name," and "@". Mind @ is not a special character!
echo
awk "/@.*{$name,/ {print;flag=1;next} /@/ {flag=0} flag { print }" $bib
echo
done
Coloring molecules by any quantity (A VMD Tcl script)
A VMD Tcl script that colors molecules by any quantity. In this particular case the dihedral angle is chosen, and it is calculated directly by VMD.
To start it from the Tcl console, use the following line:
source vmd_color_moldules_by_dihedralangle.tcl
To start it from the terminal, use the following line:
vmd -e vmd_color_moldules_by_dihedralangle.tcl
Every molecule can get a specific beta value (B-factor, temperature factor; see http://proteopedia.org/wiki/index.php/Disorder )
The value ranges from 0 to 1 and is connected with a specific color in the VMD coloring method "Beta":
0 ... 0.25 ... 0.5 ... 0.75 ... 1
red ... pink ... white ... violet ... blue
Java program to log your activity during the day
The program starts every 15 minutes, asks what you are doing, and writes the results to a log file.
HOW TO INSTALL:
Create the folder ~/TaskLogger (the location is fixed; to change it you need to modify the taskLogger script)
Download the following files to this folder:
Iterate Over Lists
The meaning of the parameters is the following:
variables: names of the iterators
lists: the values of the iterators
fn: the function, which takes one argument (data)
fn_param: arbitrary function parameters; a dictionary with additional data made available to the iteration function
The function fn is called for all combinations of the iterator values from the lists.
The values of the iterators are passed to the function fn in the dictionary data.
The keys of the dictionary are the same as the variables parameter passed to the iterate_over_lists function.
In addition, the entries of fn_params are set in the data dictionary.
For more information see iterate_over_lists_readme
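The calling convention described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the argument order, the returned list of results, and the example values are hypothetical, and the real function is the one documented in iterate_over_lists_readme.

```python
import itertools


def iterate_over_lists(variables, lists, fn, fn_params=None):
    """Call fn(data) once for every combination of the iterator values.

    Hypothetical sketch: collecting the return values in a list is an
    assumption, not something the original description states.
    """
    results = []
    for combination in itertools.product(*lists):
        data = dict(fn_params or {})            # additional parameters first
        data.update(zip(variables, combination))  # iterator values, keyed by name
        results.append(fn(data))
    return results


# Example values are made up for illustration.
calls = iterate_over_lists(
    ["temperature", "solute"],
    [[300, 350], ["methane", "ethane"]],
    lambda data: (data["solute"], data["temperature"], data["prefix"]),
    {"prefix": "run"},
)
```

The last list varies fastest here, which is simply the behaviour of itertools.product; the original script may order the combinations differently.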
Scripts
Here we may put some useful scripts, whatever they do
Journal Abbreviations in Bibtex
Convert xyz to RISM system (via pdb)
Update the Bibliography
File conversion
xyz2coors
dx2dat
3D distribution in matlab to *.dx
xvv2dat
amb2gmx
ambertop2rismmol
rismmolRDFs2gvv
gvv2rismmolRDFs
rism2ambertop
ffld_server2towhee
determine chemical class of compound by pdb file
Script which does fitting or calculation of 1D-RISM
Separate set of names to classes
Assign OPLS-AA (2005) FF parameters
Generate Amber Topology
Bibtex file with entries which are used in Texfile
generate dependencies for C++ file
write many one-column files in many columns format
Make Badges for a workshop
Create the Path with all sub folders
Moves from one folder to another with Filter, preserving the folder structure
read PDB file from C
Java program to log your activity during the day
Iterate Over Lists
Useful MATLAB scripts and functions
Calculate potential around the spherical solute
convert matrix element by element to the cell array
Convert matlab cell array to latex table
Convert matlab cell array to excel compatible *.csv file
Concatenate cell matrix and Real matrix in matlab
Save 3D data array to the Gaussian Cube file
Transform point or position from data space
Convert the number to the TeX-compatible string format with 10^ notation
Print the time in seconds in HMS (hour-minute-second) format
Plot data and labels for it
rotate the 3D molecule by Euler angles
maximum / minimum of multidimensional array
restore plot samples from image
Bash scripts for processing Gromacs input / output files
Duplicate a cubic water box in order to obtain a HUGE water box (> 1M atoms)
Coloring molecules by any quantity (A VMD Tcl script)
Journal abbreviations in BibTeX
I have written a script to make journal abbreviations. You can find it on the Document editing page.
Convert xyz to RISM system (via pdb)
The script uses VMD, Maestro, MATLAB, Bash, and Python. Please check that all of them are installed properly.
This script does the following:
1) Takes a txt file (e.g. Guthrie.txt) and splits it into xyz files
2) Converts xyz to pdb using VMD
3) Assigns OPLS2005 force field parameters using Maestro and extracts the coordinates from the pdb into a separate file
4) Runs a MATLAB script, which merges the coordinates and the force field and puts the result into the DataSet
Usage: ./convert_all [txt filename] [DataSet] [vmd command] [DataSetsPath]
Defaults: [txt filename] : Guthrie.txt
[DataSet] : SAMPL1_ru
[vmd command] : vmd
[DataSetsPath] : /net/v215-2/data4/fedorov-group/Database/DataSets
To get this script from svn
1) go to the folder, where your input files are
2) type in terminal:
svn export http://triton.mis.mpg.de/svn/fedorov/Development/utils/xyz2system.tar
To unarchive the script type in the terminal
tar -xf xyz2system.tar
To run this script type in terminal
./convert_all
If you need only PARTS of the script, read the file convert_all: it calls separate scripts that perform each of steps 1-4.
The script is not well documented, but there are some comments in convert_all. Not many, but... sapienti sat
Convert matlab cell array to latex table
There is a MATLAB script which converts a cell array to a LaTeX source file with a table.
(read more about matlab cell arrays )
To obtain the script from the SVN type in terminal
svn export http://triton.mis.mpg.de/svn/fedorov/Development/utils/cell2tex.m
or just download the code:
Usage of the script:
cell2tex(cell array,tex file name)
Example of using the script.
Note that the first line of the example copies the function cellcat2 to the working folder.
If the folder of our database is in your MATLAB path, you may omit this line.
1) Run Matlab
2) Type
copyfile /net/v215-2/data4/fedorov-group/Database/bin/cellcat2.m .
or also you may download cellcat2.m :
A=[3 1 4; 1 5 9; 2 6 5]
Head= {' ','A','B','C'}
C=[ Head;cellcat2({'X';'Y';'Z'},A)]
cell2tex(C,'a.tex')
type a.tex
!pdflatex a.tex >/dev/null
!acroread a.pdf &
An example for the sets of solutes from our JPCB_2010:
1) To obtain necessary scripts from the SVN type in terminal
svn export http://triton.mis.mpg.de/svn/fedorov/Development/RISM-MODULES/five_curve/Ratkova/RISMtable.m
svn export http://triton.mis.mpg.de/svn/fedorov/Development/RISM-MODULES/five_curve/Ratkova/separClasses.m
svn export http://triton.mis.mpg.de/svn/fedorov/Development/RISM-MODULES/five_curve/Ratkova/matrixReorder.m
svn export http://triton.mis.mpg.de/svn/fedorov/Development/RISM-MODULES/five_curve/Ratkova/cell2tex_rism2.m
2) Run script RISMtable.m
RISMtable(Names1,Names2,VUA1,VUA2,rho,Mu1X,Mu1Y,Mu2X,Mu2Y,MuExp1,MuExp2)
Please read the comments in the script! Currently it works for the following classes:
chem_class={'alkane'
'alkene'
'alkylbenzene'
'alcohol'
'phenol'
'chloroalkane'
'aldehyde'
'ketone'
'ether'
'polyfragment'};
and prepares a table with the following columns:
Head2= {'Name','class','\rho V','\Delta G^{GF}','\Delta G^{RISM-UC}','\Delta G^{exp}'};
Update the Bibliography
The script is sort of working... will be updated... :)
#!/bin/bash
if [ "$1" == "-h" ] || [ "$1" == "--help" ] || [ "$1" == "-?" ]; then
echo "Usage: ./Bibliography.sh UserName"
exit 0
fi
# Exactly one argument (the user name) is required
if [ "$#" -ne 1 ]; then
echo "Usage: ./Bibliography.sh UserName"
exit 1
fi
name=$1
# Check out the shared bibliography into the user's working copy
cd "/usr/people/$name/svn/Main" || exit 1
svn co http://triton.mis.mpg.de/svn/fedorov/Main/Bibliography
echo "$name"
### Changing the user name from "saburdja" to the name read from the command line
cd Bibliography || exit 1
cp -f bibtexfilegeneral_linux.bib "$name.bib"
sed -i "s/saburdja/$name\/svn/g" "$name.bib"
### Running JabRef
cd "/usr/people/$name/distr" || exit 1
java -jar JabRef-2.5.jar
File conversion
xyz2coors
This command takes all files with the xyz extension (coordinate file format), extracts the coordinates, and saves them in file.coors.
dx2dat.py
The script converts the *.dx file format into *.dat.
*.dx contains a 3D distribution (output from NAB-3DRISM)
*.dat is just a column of 3D data.
Mind: z varies fastest, y medium, x is the slowest.
In addition, the script writes x.dat, y.dat, z.dat: the support (grid points) in each direction
USAGE: dx2dat.py mol.dx > mol.dat
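The ordering rule (z fastest, x slowest) determines how the single column in *.dat maps back to grid points. A minimal sketch, assuming the grid sizes nx, ny, nz are known (for example from the lengths of x.dat, y.dat, z.dat); the function names are hypothetical and not part of dx2dat.py:

```python
def flat_index(ix, iy, iz, ny, nz):
    """Position in the one-column data of grid point (ix, iy, iz).

    z varies fastest, y medium, x slowest; all indices are zero-based.
    """
    return (ix * ny + iy) * nz + iz


def to_3d(values, nx, ny, nz):
    """Rebuild a nested list grid[ix][iy][iz] from the flat column."""
    if len(values) != nx * ny * nz:
        raise ValueError("number of values does not match the grid size")
    return [
        [[values[flat_index(ix, iy, iz, ny, nz)] for iz in range(nz)]
         for iy in range(ny)]
        for ix in range(nx)
    ]


# Toy data: the value equals its own position in the column.
grid = to_3d(list(range(24)), nx=2, ny=3, nz=4)
```

With this toy data, stepping in z moves by one position in the column, stepping in y by nz positions, and stepping in x by ny*nz positions.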
3D distribution in matlab to *.dx
The script saves a 3D distribution from MATLAB in the *.dx file format.
G3DtoDX(GX,GY,GZ,g3D,dxFileName)
GX,GY,GZ - grids in x,y,z directions ( size [ Nx x 1], [Ny x 1] [Nz x 1])
g3D - three dimensional distribution (size [ Ny x Nx x Nz ])
dxFileName - name of the file where the distribution will be saved
xvv2dat.py
The script converts the *.xvv file format into *.dat.
*.xvv contains 1D solvent susceptibility functions in k-space (output from rism1d in AmberTools)
*.dat: 1st column - support, 2nd-Xth - susceptibility functions.
Mind: standard Amber output order - has to be clarified.
USAGE: xvv2dat.py mol.xvv > mol.dat
ambertop2rismmol.py
The script converts a *.top and a *.pdb file into *.rismmol. TOP and PDB must be consistent!
*.top contains the topology and the force field parameters of a molecule (generated by Antechamber and TLeap of AmberTools). Use the pdb generated by TLeap.
*.rismmol is the topology for the RISM-MOL program. (x,y,z,sig,eps,q)
By default the parameters in the top file are: epsilon[kcal/mol], Rmin/2[A], q[18.2223*|e|]
in rismmol: x[A], y[A], z[A], epsilon[kcal/mol], sigma[A], q[|e|]
USAGE: ambertop2rismmol.py mol.top mol.pdb mol.rismmol
converts the data
USAGE2: ambertop2rismmol.py mol.top mol.pdb mol.rismmol CHECK
converts the data and prints additional output into stdout
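The unit change between the two parameter sets listed above can be sketched as follows. It uses the standard Lennard-Jones relation Rmin = 2^(1/6) * sigma and the Amber charge factor 18.2223 quoted above; whether ambertop2rismmol.py performs exactly these steps is an assumption, and the function names are hypothetical.

```python
AMBER_CHARGE_FACTOR = 18.2223  # Amber topology files store q * 18.2223


def rmin_half_to_sigma(rmin_half):
    """Convert Rmin/2 [A] to the Lennard-Jones sigma [A]."""
    return 2.0 * rmin_half / 2.0 ** (1.0 / 6.0)


def amber_charge_to_e(q_amber):
    """Convert a charge from Amber internal units to units of |e|."""
    return q_amber / AMBER_CHARGE_FACTOR


# Example with the TIP3P oxygen values (Rmin/2 = 1.7683 A, q = -0.834 |e|).
sigma_o = rmin_half_to_sigma(1.7683)
charge_o = amber_charge_to_e(-0.834 * AMBER_CHARGE_FACTOR)
```

Epsilon needs no conversion, since both formats use kcal/mol.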
rism2ambertop
How to use:
- download
- unpack
./rism2ambertop input.rism output.top output.inpcrd [output.pdb]
- if it does not work - recompile it:
make r
rismmolRDFs2gvv
gvv2rismmolRDFs
Reorder the gvv file, which is produced by the AmberTools 1D-RISM program, to the format used in the rism-mol (and RISM-MOL 3D) program, and back.
Suppose we have atoms 1,2,3,4
and RDFs g11, g12, g13, g14,...
RISM-MOL order is:
g11, g12, g13, g14, g22, g23, g24, g33, g34, g44
AmberTools order is:
g11, g12, g22, g13, g23, g33, g14, g24, g34, g44
The orders are easier to understand if the RDFs are written as a table:
     1    2    3    4
1   g11  g12  g13  g14
2        g22  g23  g24
3             g33  g34
4                  g44
In the RISM-MOL order (rdf file) the RDFs are listed row by row.
In the AmberTools order (gvv file) they are listed column by column.
These scripts generate the gawk command that reorders the RDFs and then run it.
USAGE: ./gvv2rismmolRDFs mol.gvv mol.rdf
USAGE: ./rismmolRDFs2gvv mol.rdf mol.gvv
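The two orders can be generated for any number of atoms, which gives the column permutation between the files. The sketch below is an illustration in Python and not the gawk-based scripts themselves; the function names are hypothetical, and the indices count only the RDF columns (a leading distance column, if present, would shift them by one).

```python
def row_order(n):
    """RISM-MOL order: upper triangle of the table, row by row."""
    return [(i, j) for i in range(1, n + 1) for j in range(i, n + 1)]


def column_order(n):
    """AmberTools order: upper triangle of the table, column by column."""
    return [(i, j) for j in range(1, n + 1) for i in range(1, j + 1)]


def gvv_to_rismmol_columns(n):
    """For each RISM-MOL column, the zero-based gvv column to take it from."""
    position = {pair: k for k, pair in enumerate(column_order(n))}
    return [position[pair] for pair in row_order(n)]


permutation = gvv_to_rismmol_columns(4)
```

For four atoms the permutation is 0, 1, 3, 6, 2, 4, 7, 5, 8, 9: the third RISM-MOL column (g13) is the fourth gvv column, and so on.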
amb2gmx.pl
http://ffamber.cnsm.csulb.edu/ffamber-tools.html
The Perl script converts an Amber topology file (prepared by Antechamber and LEaP) into a Gromacs topology file. See more details at http://ffamber.cnsm.csulb.edu.
USAGE:
>>./amb2gmx.pl --prmtop amber-prmtop-filename --crd amber-crd-filename --outname outname [--debug] [--help]
Extract coordinates from Gaussian output
Find below the script for extracting coordinates from Gaussian output.
It takes 2 arguments:
1) name of molecule (name of log file without extension)
2) number of atoms in the molecule
It prints the coordinates in xyz format to the standard output.
Let's assume that the script is saved as g03log2xyz.
It is then called with the molecule name and the number of atoms, and its standard output is redirected to the xyz file.
Below is the text of the script:
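#!/bin/bash
name=$1
N=$2
N2=$(expr $N + 2)
# xyz header: the number of atoms, then a comment line with the molecule name
echo $N
echo $name
# Take the archive block at the end of the log (it starts at GINC-<host name>, here TRITON, and ends at \\@),
# join it into one line, turn every '\' into a new line and every ',' into a space,
# and keep the N lines with the atoms that precede the "Version" field
cat $name.log| grep GINC-TRITON -A 200 | grep \\\\\@ -B 200 | tr -d $'\n ' | tr '\\,' $'\n ' | grep Version -B 200 | tail -n-$N2 | head -n+$N > $name.tmp
# 4 words per atom means "element x y z"; otherwise an extra second field is present and is skipped
M=$(cat $name.tmp | wc -w)
if [ $M == $(expr $N \* 4) ]; then
cat $name.tmp
else
cat $name.tmp | gawk '{ print $1" "$3" "$4" "$5 }'
fi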
ffld_server2towhee
The script converts the force field created by the ffld_server utility of Schrodinger Maestro to MCCCS Towhee input files. See the end of the script to set the input and output file names.
Fitting and extracting 1D-RISM
function [err,mue,muc,vuac,mucSDC,b]=FitSetWithSDC(DSP,Set,SetName,MM,mu_exp,X,b,key)
% EXAMPLE USAGE
%%% [err2,mue2,muc2,vuac2,mucSDC2,b2]=FitSetWithSDC(DataSetsPath,Names2,'JPCB_3DRISM_FROLOV','GF',mu_exp,X_TeS,b1,'CALC');
% Set - cell array with the names of the compounds
% SetName - how it is called in DataSets directory
% MM - MuMethod
% mu_exp - structure array with the experimental mu data
% X - the table with descriptors (WITHOUT a0 and a1 !!!!!!!)
% b - list with SDC coefficients
% key='FIT' (to fit the data and output the coefficients in b) or key='CALC' (use the coefficients from b)
rho=0.0333;
DataSetsPath = DSP;
[DU,EU]=unit_names;
%SetName='JPCB_3DRISM_FROLOV';
%MM='GF';
%%% LOAD CALCULATED MU DATA
P = getDataSets(DataSetsPath,SetName,'mu');
[FL,KL,KFL,PL]=multiSelect(P,'MuMethod',MM);
R=do_something('R=load(filename);',FL,KL,KFL,PL);
a=showResult(KL,R,'Name');
mu_calc = CellNumberArray2MatArray( CellMart2CellColumn(a,2));
nam_calc = CellMart2CellColumn(a,1);
%%% LOAD CALCULATED PMV DATA
P = getDataSets(DataSetsPath,SetName,'VUA');
[FL,KL,KFL,PL]=multiSelect(P);
R=do_something('R=load(filename);',FL,KL,KFL,PL);
a=showResult(KL,R,'Name');
vua_calc = CellNumberArray2MatArray( CellMart2CellColumn(a,2));
nam_calc_vua = CellMart2CellColumn(a,1);
%%% LOADING THE DATA FOR THE SPECIFIED SET
mue=[]; muc=[]; vuac=[];
N = length(Set);
for i = 1:N
name=Set{i};
sname=alphanumeric(name);
try
mue(i)=mu_exp.(sname)*unit2unit(EU,'kJ/mol','kcal/mol');
catch exception
sname = [sname '_'];
mue(i)=mu_exp.(sname)*unit2unit(EU,'kJ/mol','kcal/mol');
end
if strcmp(name,'nonan-5-one') ; name='nonan-5-on'; end
[val,ind1]=intersect(nam_calc,name);
if isempty(val)
[val,ind1]=intersect(nam_calc,[ upper(name(1)) name(2:end)]);
tmp='Upper case !!!!'
val
end
if (length(ind1) ~= 1) ; tmp='LENGTH(IND1) ~= 1 WARNING!!!!!!'
name
end
muc(i)=mu_calc(ind1);
[val,ind2]=intersect(nam_calc_vua,name);
if isempty(val)
[val,ind2]=intersect(nam_calc_vua,[ upper(name(1)) name(2:end)]);
tmp='Upper case (VUA)!!!!'
end
vuac(i)=vua_calc(ind2);
if (ind1 ~= ind2) ; tmp='IND1 ~= IND2 WARNING!!!!!!'
end
end
%%% RUN FIT
vuac=vuac'; mue=mue'; muc=muc';
D0=ones(N,1); % free coefficient (column vector, one entry per compound)
D1=vuac.*rho; % PMV
X=[D0 D1 X];
D2=X(:,3); D3=X(:,4); D4=X(:,5); D5=X(:,6); D6=X(:,7); D7=X(:,8); D8=X(:,9); D9=X(:,10); D10=X(:,11);
Y=mue-muc; % the correction that the descriptors have to reproduce
if strcmp(key,'FIT')
tmp='Performing FIT!!!, b is calculated from regress'
b=[];
b=regress(Y,X);
elseif strcmp(key,'CALC')
tmp='Performing CALC!!!, b is taken as an input'
else
tmp='key must be FIT or CALC, exiting !!!'
err=struct; mucSDC=[];
return
end
a0=b(1); a1=b(2); a2=b(3); a3=b(4); a4=b(5); a5=b(6); a6=b(7); a7=b(8); a8=b(9); a9=b(10); a10=b(11);
mucSDC=muc+a0+a1*D1+a2*D2+a3*D3+a4*D4+a5*D5+a6*D6+a7*D7+a8*D8+a9*D9+a10*D10;
%%% ERROR STATISTICS OF THE CORRECTED VALUES
err=struct;
err.mean_SDC=mean(mucSDC - mue); % mean error
err.std_SDC=std(mucSDC-mue,1); % standard deviation, normalized by N
err.rms=sqrt(mean((mucSDC-mue).^2)); % root mean square error
err.mae=max(abs(mucSDC-mue)); % note: this is the MAXIMUM absolute error
R = corrcoef(mucSDC,mue); err.R=R(1,2); % correlation coefficient
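The error measures stored in err can be restated in a few lines of Python. This is a sketch for illustration and not part of FitSetWithSDC; the function and key names are hypothetical. It follows the MATLAB code literally: the standard deviation is normalized by N (as in std(x,1)), and the quantity stored in err.mae is the maximum absolute error.

```python
import math


def error_statistics(calculated, experimental):
    """Mean error, population standard deviation, RMS and maximum absolute error."""
    diff = [c - e for c, e in zip(calculated, experimental)]
    n = len(diff)
    mean = sum(diff) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in diff) / n)  # like std(x, 1)
    rms = math.sqrt(sum(d * d for d in diff) / n)
    max_abs = max(abs(d) for d in diff)
    return {"mean": mean, "std": std, "rms": rms, "max_abs": max_abs}


# Toy numbers, chosen only to make the arithmetic easy to follow.
stats = error_statistics([1.0, 2.0, 4.0], [1.0, 1.0, 1.0])
```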
Separate a set of names to classes
NEW VERSION:
svn export http://triton.mis.mpg.de/svn/fedorov/Development/RISM-MODULES/five_curve/Ratkova/separClasses.m
OLD VERSION:
function name_dist(Names)
% For classes: alkane, chloroalkane, alkylbenzene, chlorobenzene
m=numel(Names); % number of names (works for row and column cell arrays)
CLASSES={};
alkanes={}; chloroalkanes={};
alkylbenzenes={}; chlorobenzenes={};
for i=1:m
name=Names{i};
% has(s) is true if s occurs in name as a substring
% (ismember would only test that the single characters occur somewhere in the name)
has=@(s) ~isempty(strfind(name,s));
if has('ane')
if has('chloro')
class='chloroalkane';
chloroalkanes=[chloroalkanes,{name}];
else
class='alkane';
alkanes=[alkanes,{name}];
end
elseif (has('benzene') || has('toluene') || has('xylene'))
if has('chloro')
class='chlorobenzene';
chlorobenzenes=[chlorobenzenes,{name}];
else
class='alkylbenzene';
alkylbenzenes=[alkylbenzenes,{name}];
end
else
class='unknown'; % the name matches none of the four classes
end
CLASSES=[CLASSES;{name class}];
end
% Print the size of each class and the total
fprintf('alkanes')
s1=size(alkanes)
fprintf('chloroalkanes')
s2=size(chloroalkanes)
fprintf('alkylbenzenes')
s3=size(alkylbenzenes)
fprintf('chlorobenzenes')
s4=size(chlorobenzenes)
S=s1+s2+s3+s4
end
Assign OPLS-AA (2005) FF parameters
$SCHRODINGER/utilities/ffld_server -ipdb pdb_name -print_parameters
To parse the ffld_server output use this script:
See also: Force fields
Generate Amber Topology
The script generates a topology file for a molecule using AmberTools. It takes a pdb file as input and runs antechamber and tleap.
Bibtex file with entries which are used in Texfile
If you have a TeX document and a big general Bib file for it, you probably want to generate a smaller Bib file that contains only the entries cited in your TeX file.
The following two scripts do that:
The scripts use grep, sed, awk text processors.
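The first step, collecting the citation keys from the TeX source, can also be sketched in Python. This is a swapped-in technique for illustration (a regular expression instead of the grep, sed and awk pipeline), and the function name and the sample text are hypothetical.

```python
import re

# \cite, \citep, \citet, ... with optional [..] arguments, then {key1, key2}
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def cited_keys(tex_source):
    """Return the sorted unique citation keys found in a TeX source string."""
    keys = set()
    for group in CITE_RE.findall(tex_source):
        for key in group.split(","):
            key = key.strip()
            if key:
                keys.add(key)
    return sorted(keys)


sample = r"See \cite{a1, b2} and \citep[p.~3]{c3}. Again \cite{a1}."
keys = cited_keys(sample)
```

The sorted unique list plays the role of the REFSFILE that the second script reads.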
Generate dependencies for C++ file
Below is the script that generates dependencies for a C++ file (it searches for includes recursively and gives the full list of c and cpp files needed by that one file).
extract includes is a helper script
Usage:
./genRequireList a.cpp > a_require.cpp
After that a_require.cpp contains lines like #include "b.cpp" etc.
Then one may compile:
g++ -o a a.cpp a_require.cpp
Write many one-column files in multi-column format
How to use:
- download multicol.c
- compile:
gcc -o multicol multicol.c
- Use:
./multicol File1 File2 ... FileN
Updated version (in Python)
- does not need to be compiled
- automatically aligns the width of the columns
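The idea of merging one-column data into aligned columns can be sketched as follows. This is an independent illustration and not the updated script from this page; the function name and the two-space separator are assumptions.

```python
from itertools import zip_longest


def merge_columns(columns, sep="  "):
    """Merge lists of strings (one list per input file) into aligned text lines."""
    # Width of each column is the length of its longest entry.
    widths = [max((len(value) for value in column), default=0) for column in columns]
    lines = []
    for row in zip_longest(*columns, fillvalue=""):  # shorter columns are padded
        cells = [value.ljust(width) for value, width in zip(row, widths)]
        lines.append(sep.join(cells).rstrip())
    return lines


merged = merge_columns([["1", "22", "333"], ["a", "b"]])
```

Reading the files is left out; each input file would simply be read into a list of stripped lines before calling the function.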
Calculate potential around the spherical solute
Calculates the potential of the solute atoms in the water solution
Parameters:
r - vector with distance samples
g - g_sa(r): RDF functions of all atoms.
columns 1,3,5,...2s-1,... are the RDFs of the s-th solute atom with water oxygen
columns 2,4,6,...,2s,... are the RDFs of the s-th solute atom with
water hydrogen
density - number density of a solute (in Bohr^-3)
charge_o - partial charge of water oxygen (without a sign)
Usage: Phi=calc_Phi(r,g,density,charge_o)
convert matrix element by element to the cell array
Usage: C=celificate(A)
Converts matrix A to the cell matrix C of the same size; the elements stay the same but become cells
Convert matlab cell array to excel compatible *.csv file
Usage: cell2csv(C,csv_fname)
Writes the cell matrix C to the comma-separated values file csv_fname.
CSV file can be opened with Excel or Open Office Calc.
Concatenate cell matrix and Real matrix in matlab
M=cellcat2(Cell, Real)
Horizontally concatenates the cell matrix Cell and the real matrix Real.
The matrices should have the same number of rows
INPUT PARAMETERS:
Cell - cell matrix
Real - real matrix
OUTPUT PARAMETER:
M - cell matrix: result of the concatenation.
If Cell has m rows and n cols and Real has m rows and k
cols, M has m rows and n+k cols. The first n cols of M are the same as
Cell, the last k columns are the columns from Real
Save 3D data array to the Gaussian Cube file
Usage: cube2file(CUB,x0,y0,z0,dx,dy,dz,fname,NXYZ)
Save 3D data array to the Gaussian Cube file
INPUT PARAMETERS:
CUB - 3D matlab array with data values
x0,y0,z0 - coordinates of the corner of the cube
dx,dy,dz - grid steps in each direction
fname - name of the output file
NXYZ - coordinates of the atoms
first column - atom number in the periodic table
last three - atom coordinates
Transform point or position from data space
dsxy2figxy -- Transform point or position from data space
coordinates into normalized figure coordinates
Transforms [x y] or [x y width height] vectors from data space
coordinates to normalized figure coordinates in order to locate
annotation objects within a figure. These objects are: arrow,
doublearrow, textarrow, ellipse line, rectangle, textbox
Syntax:
[figx figy] = dsxy2figxy([x1 y1],[x2 y2]) % GCA is used
figpos = dsxy2figxy([x1 y1 width height])
[figx figy] = dsxy2figxy(axes_handle, [x1 y1],[x2 y2])
figpos = dsxy2figxy(axes_handle, [x1 y1 width height])
Usage: Obtain a position on a plot in data space and
apply this function to locate an annotation there, e.g.,
[axx axy] = ginput(2); (input is in data space)
[figx figy] = dsxy2figxy(gca, axx, axy); (now in figure space)
har = annotation('textarrow',figx,figy);
set(har,'String',['(' num2str(axx(2)) ',' num2str(axy(2)) ')'])
Copyright 2006-2009 The MathWorks, Inc.
Convert the number to the TeX-compatible string format with 10^ notation
Usage: strV=fine_num(V)
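What such a conversion involves can be sketched in Python. The exact output format of fine_num is not shown on this page, so the format below (mantissa, \cdot, power of ten) and the function name are assumptions made for illustration.

```python
import math


def tex_power_of_ten(value, digits=2):
    """Format a number as 'm \\cdot 10^{e}' for use in TeX math mode."""
    if value == 0:
        return "0"
    exponent = math.floor(math.log10(abs(value)))
    mantissa = round(value / 10 ** exponent, digits)
    if abs(mantissa) >= 10:  # rounding may push the mantissa up to 10
        mantissa /= 10
        exponent += 1
    return f"{mantissa:g} \\cdot 10^{{{exponent}}}"


small = tex_power_of_ten(0.0015)
large = tex_power_of_ten(25000)
```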
Make Badges for a workshop
Below is an example script which makes badges for the DUEL 2011 workshop.
How does it work?
It takes a txt file in which the title, surname, given name, and affiliation are listed, separated by ';'
First column - title
Second - Surname
Third - Given Name
Last - affiliation
It also takes a "preamble" (a part of a tex document),
compiles the names and affiliations into a tex file which represents the badges,
and then compiles this tex file and produces a PDF.
All necessary files are below:
Main Script (just run it)
Script which does everything
Auxiliary files:
Input data (name surname affiliation)
Print the time in seconds in HMS (hour-minute-second) format
Please find the MATLAB script below.
Usage is straightforward:
printHMS(TimeInSeconds)
TimeInSeconds can also be a matrix.
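For readers without MATLAB, the same conversion can be sketched in Python. The function name format_hms and the H:MM:SS output layout are assumptions; the layout printed by printHMS may differ.

```python
def format_hms(seconds):
    """Format a duration given in seconds as H:MM:SS."""
    seconds = int(round(seconds))
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

def format_hms_matrix(matrix):
    """Apply format_hms to every element of a list of rows."""
    return [[format_hms(t) for t in row] for row in matrix]
```

For example, format_hms(3725) gives '1:02:05'.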
Plot data and labels for it
[plothnd,texthnd]=plotWithDistance(t,x,y)
Plots plot(x,y,'.') with a text label from t near each point
INPUT ARGUMENTS:
t - labels (can be numeric)
x - x(t)
y - y(t)
OUTPUT:
plothnd - handle of the points on the plot
texthnd - handles of the text labels
Create the Path with all sub folders
Creates a folder. If the path does not exist, the whole path is created
Example:
./createPath ~/A/B/C/D
will act as follows:
mkdir ~/A
mkdir ~/A/B
mkdir ~/A/B/C
mkdir ~/
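The same behaviour is available in Python through os.makedirs. A minimal sketch (the function name create_path is an assumption, not part of the original script):

```python
import os

def create_path(path):
    """Create `path` and all missing parent folders; do nothing if it exists."""
    full = os.path.expanduser(path)  # expand a leading ~
    os.makedirs(full, exist_ok=True)
    return full
```

Calling create_path twice on the same path is harmless, because exist_ok=True suppresses the error for folders that already exist.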
Move files from one folder to another with a filter, preserving the folder structure
Usage: ./moveSubFolders Path1 Path2 Folder Filter
moves all files from Path1/Folder and its sub folders that satisfy the Filter (grep Filter) to the Path2/Folde
the folder structure is preserved
Example:
Suppose you have the following folder structure:
~
- P1
- - A
- - - 1
- - - - x.txt
- - - - y.txt
- - - - z.dat
- - - 2
- - - - a.txt
- - - - x.dat
- - - - z.dat
- P2
The command
./moveSubFolders ~/P1 ~/P2 A [.]dat
will create P2/A and move all dat files there, preserving the folder structure, i.e. the result will be
~
- P1
- - A
- - - 1
- - - - x.txt
- - - - y.txt
- - - - z.dat
- - - 2
- - - - a.txt
- - - - x.dat
- - - - z.dat
- P2
- - A
- - - 1
- - - - z.dat
- - - 2
- - - - x.dat
- - - - z.dat
This script is useful if, for example, you want to move the RDFs obtained from MD simulations to another destination while these RDFs are spread over different folders.
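The idea can also be sketched in Python. This is an illustration, not the original script: the function name is hypothetical, and the filter is applied to the file name only, which is an assumption (the original uses grep and may match against the whole path). The sketch uses shutil.move; replacing it with shutil.copy2 would leave the source files in place.

```python
import os
import re
import shutil

def move_sub_folders_sketch(path1, path2, folder, pattern):
    """Move files under path1/folder whose names match `pattern`
    to path2/folder, recreating the sub folder structure."""
    regex = re.compile(pattern)
    src_root = os.path.join(path1, folder)
    moved = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            if not regex.search(name):
                continue
            # Position of this folder relative to path1/folder
            rel_dir = os.path.relpath(dirpath, src_root)
            dst_dir = os.path.normpath(os.path.join(path2, folder, rel_dir))
            os.makedirs(dst_dir, exist_ok=True)
            shutil.move(os.path.join(dirpath, name), os.path.join(dst_dir, name))
            moved.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(moved)
```

With the example tree above, move_sub_folders_sketch("~/P1", "~/P2", "A", "[.]dat") would need the ~ expanded first (os.path.expanduser), since Python does not expand it automatically.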
read PDB file from C
http://www.koders.com/c/fid0156219555C399184E53189E067753FE0F8A1011.aspx
Rotate the 3D molecule by Euler angles
[Rot,XYZ1]=rotateCoors(EulerAngles,XYZ)
Rotate coordinates by Euler Angles
INPUT PARAMETERS:
EulerAngles - Euler angles [ psi, theta, phi ]
XYZ - coordinates of the molecule, each COLUMN - coordinates of the
atom ( not each ROW!, be careful)
OUTPUT PARAMETERS:
Rot - rotation matrix. To obtain new coordinates : X'=Rot*X
XYZ1 - new coordinates XYZ1 = Rot*XYZ
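The help text does not say which Euler convention rotateCoors uses. The Python sketch below assumes the z-x-z convention, Rot = Rz(psi)*Rx(theta)*Rz(phi), purely for illustration; the function names are hypothetical. As in the MATLAB function, XYZ is stored with one atom per column (3 rows, N columns).

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rotate_coors_sketch(euler_angles, xyz):
    """Return (Rot, Rot*XYZ) for angles [psi, theta, phi] in radians.

    Assumption: z-x-z convention, Rot = Rz(psi) * Rx(theta) * Rz(phi).
    xyz is a 3 x N list of rows, i.e. each COLUMN is one atom.
    """
    psi, theta, phi = euler_angles
    rot = matmul(rot_z(psi), matmul(rot_x(theta), rot_z(phi)))
    return rot, matmul(rot, xyz)
```

Whatever the convention, Rot is orthogonal, so distances between atoms are unchanged by the rotation.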
Duplicate a cubic water box in order to obtain a HUGE water box (> 1M atoms)
Initial configuration including 1024 SPC/E water molecules in a cubic box
Coloring molecules by any quantity (A VMD Tcl script)
A VMD Tcl script to color molecules by any quantity
In this particular case the dihedral angle is chosen and calculated directly by VMD.
To start it from the Tcl console, use the following line:
source vmd_color_moldules_by_dihedralangle.tcl
To start it from a terminal, use the following line:
vmd -e vmd_color_moldules_by_dihedralangle.tcl
Every molecule can get a specific beta value (B-factor, Temperature factor, see http://proteopedia.org/wiki/index.php/Disorder )
The value ranges from 0 to 1 and is connected with a specific color representation in vmd (Beta)
0 ... 0.25 ... 0.5 ... 0.75 ... 1
red ... pink ... white ... violet ... blue
Example output:
Java program to log your activity during the day
Starts every 15 minutes, asks what you are doing and writes the results to a log file.
HOW TO INSTALL:
Create the folder ~/TaskLogger (the location is fixed; to change it you need to modify the taskLogger script)
Download the following files to this folder:
Run the taskLogger script.
HINT: you may add the following lines to your ~/.bashrc
TaskLogger=$( ps -Af | grep taskLogger | grep -v grep | wc -l )
if [ "$TaskLogger" -eq 0 ]; then
~/TaskLogger/taskLogger &
fi
The program creates a log file with the following name:
~/TaskLogger/TaskLog_[DAY][MONTH][YEAR].log
To convert the log into a report with the total amount of working time, you may use the following script:
Usage: ./logReader LOGFILE REPORTFILE
Installation:
Step 0: Install RISM-MOL-Tools (www.compchemmpi.wikispaces.com/RISM-MOL-Tools)
Step 1: download the files:
get_mol_classes
sigmas.dat
dummy.ff
mol_classes.ff
Usage: get_mol_classes molecule.pdb [Optional Path to sigmas.dat dummy.ff mol_classes]
Example:
get_mol_classes 2_ethoxyethanol.pdb
Output:
Class: polyfragment
Chlorinated: no
------------ Components ------------
DoubleBond: 0
TripleBond: 0
Benzyl: 0
Aldehyde: 0
Alcohol: 1
Ketone: 0
Carboxyl: 0
Ether: 1
Clorine: 0
Iterate Over Lists
The file iterate_over_lists.py provides a basic interface for iterating over multiple iterators specified in Python list variables.
To use the interface, import it with the command:
from iterate_over_lists import *
The main function in the file is iterate_over_lists.
Please find below the description of this function:
def iterate_over_lists(variables,lists,fn,fn_param={})
The meaning of the parameters is the following:
variables: names of the iterators
lists: the values of the iterators
fn: the function which takes one argument (data)
fn_param: arbitrary function parameters. A dictionary which contains additional data available to the iteration function
The function fn will be called for all combinations of the iterator values from the lists.
The values of the iterators are passed to the function fn using the dictionary data.
The keys of the dictionary are the same as the variables parameter passed to the iterate_over_lists function.
Also, additional entries will be set in the data dictionary to be the same as in fn_param.
For more information see iterate_over_lists_readme
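The behaviour described above can be reproduced with itertools.product. The sketch below is a stand-in written for illustration, not the code from iterate_over_lists.py; in particular the order in which combinations are visited (last list varies fastest) is an assumption.

```python
import itertools

def iterate_over_lists_sketch(variables, lists, fn, fn_param=None):
    """Call fn(data) once for every combination of values from `lists`.

    data maps each name in `variables` to its current value and also
    contains all entries of fn_param.
    """
    for values in itertools.product(*lists):
        data = dict(fn_param or {})        # fresh copy for every call
        data.update(zip(variables, values))
        fn(data)

# Usage: collect every (T, rho) combination together with a fixed label
results = []
iterate_over_lists_sketch(
    ["T", "rho"],
    [[300, 350], [0.9, 1.0]],
    lambda data: results.append((data["T"], data["rho"], data["label"])),
    {"label": "run"},
)
```

After the call, results holds four tuples, one per combination of T and rho.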
Maximum / minimum of a multidimensional array
idx = maxn(S)
computes the position of the maximum in the multidimensional array S
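For comparison, the same search can be sketched in Python for nested lists. The function name argmax_nd is hypothetical, and unlike MATLAB the returned indices are 0-based.

```python
def argmax_nd(s):
    """Return (index_tuple, value) of the largest element of a nested list."""
    if not isinstance(s, list):
        return (), s                      # a scalar: no further indices
    best_idx, best_val = None, None
    for i, sub in enumerate(s):
        idx, val = argmax_nd(sub)
        # Strict '>' keeps the first occurrence when values are tied
        if best_val is None or val > best_val:
            best_idx, best_val = (i,) + idx, val
    return best_idx, best_val
```

For the 2 x 2 list [[1, 5], [7, 2]] the result is ((1, 0), 7). The minimum can be found the same way by flipping the comparison.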
Restore plot samples from image